Control Strategies for a Stochastic Planner

Authors

  • Jonathan Tash
  • Stuart J. Russell
Abstract

We present new algorithms for local planning over Markov decision processes. The base-level algorithm possesses several interesting features for control of computation, based on selecting computations according to their expected benefit to decision quality. The algorithms are shown to expand the agent’s knowledge where the world warrants it, with appropriate responsiveness to time pressure and randomness. We then develop an introspective algorithm, using an internal representation of what computational work has already been done. This strategy extends the agent’s knowledge base where warranted by the agent’s world model and the agent’s knowledge of the work already put into various parts of this model. It also enables the agent to act so as to take advantage of the computational savings inherent in staying in known parts of the state space. The control flexibility provided by this strategy, by incorporating natural problem-solving methods, directs computational effort towards where it’s needed better than previous approaches, providing greater hopes for scalability to large domains.

Similar resources

Competitive supply of durable goods under stochastic fluctuation in stock

This paper presents a theoretical model in which the stock growth rate of durable goods has stochastic fluctuation over time. It concludes that a social planner increases the expected percentage rate of production since uncertainty increases the user cost from the consumer’s point of view.

Stochastic cooperative advertising in a manufacturer–retailer decentralized supply channel

This work considers cooperative advertising in a manufacturer–retailer supply chain. The manufacturer is the Stackelberg leader and the retailer is the follower. Using the Sethi model, it models the dynamic effect of the manufacturer’s and retailer’s advertising efforts on sales. It uses optimal control techniques and stochastic differential game theory to obtain the players’ advertising strategies a...

Cost-Effective Sensing during Plan Execution

Between sensing the world after every action (as in a reactive plan) and not sensing at all (as in an open-loop plan), lies a continuum of strategies for sensing during plan execution. If sensing incurs a cost (in time or resources), the most cost-effective strategy is likely to fall somewhere between these two extremes. Yet most work on plan execution assumes one or the other. In this paper, an...

Insurer Optimal Asset Allocation in a Small and Closed Economy: The Case of Iran’s Social Security Organization

We seek to determine the optimal amount of the insurer’s investment in all types of assets for a small and closed economy. The goal is to detect the implications and contributions the risk seeker and risk aversion insurer commonly make and the effectiveness in the investment decision. Also, finding the optimum portfolio for each is the main goal of the present study. To this end, we adopted the...

Approximation Strategies for Routing in Stochastic Dynamic Networks

In this work, we study a special semi-Markov decision process that formalizes a route-planning problem in stochastic transportation networks. We explore two versions of the planning problem: one in which the planner knows the initial traffic situation but does not have access to further information once it begins executing the plan (open-loop), and one in which the planner receives continuous t...

Planning Robust Motion Strategies for a Mobile Robot

This paper reports on recent work concerning the development and implementation of a robust motion planner for a mobile robot in a polygonal environment and in the presence of uncertainty in robot control and sensing. Such a planner explicitly takes into account the uncertainty in robot control and produces a robust motion plan composed of sensor-based motion commands. The m...


Publication date: 1994